We present a dynamic learning paradigm for "programming" a general quantum computer.
A learning algorithm is used to find the control parameters for a coupled qubit
system, such that the system at an initial time evolves to a state in which a given
measurement corresponds to the desired operation. This can be thought of as a quantum
neural network. We first apply the method to a system of two coupled superconducting
quantum interference devices (SQUIDs), and demonstrate learning of both the classical
gates XOR and XNOR. Training of the phase produces a gate congruent to the CNOT
modulo a phase shift. Striking out for somewhat more interesting territory, we attempt
learning of an entanglement witness for a two qubit system. Simulation shows a
reasonably successful mapping of the entanglement at the initial time onto the correlation
function at the final time for both pure and mixed states. For pure states this
mapping requires knowledge of the phase relation between the two parts; however, given
that knowledge, this method can be used to measure the entanglement of an otherwise
unknown state. The method is easily extended to multiple qubits or to quNits.

Keywords: quantum algorithm, entanglement, dynamic learning 
1 Introduction

Recently there has been growing interest in quantum computing [1, 2]. The possibilities seem
vast. Beyond the improvements in size and speed is the ability, at least in principle, to do
classically impossible calculations. Two aspects of quantum computing make this possible:
quantum parallelism and entanglement. While a computational setup can be constructed
[3] which makes use of quantum parallelism (superposition) only, use and manipulation of
entanglement as well realizes the full power of quantum computing and communication [4, 5, 6].

The major bottleneck to the use of quantum computers, once they are designed and built,
is the paucity of algorithms that can make use of their power. At present, there are only a few
major algorithms: Shor's factorization [7], Grover's database search [8], and the Jones polynomial
approximation [9]. It is not yet at all clear that a way will be or can be found to generate
algorithms efficiently to solve general problems on quantum computers, as pointed out by
Nielsen [10], though some recent work [10, 11, 12] using a geometric approach may prove fruitful.

In previous work [3], we have proposed the use of quantum adaptive computers to answer
this need. An adaptive computer, since it can be trained, adapts to learn and in a sense
constructs its own algorithm for the problem from the training set supplied. A quantum neural
computer shows promise for constructing algorithms to solve problems that are inherently
quantum mechanical. Here, we develop a dynamic learning algorithm for training a quantum
computer and demonstrate successful learning of some simple benchmark applications. In
addition, we show that this method can be used for learning of an entanglement witness [13]
for an input state. We show that our witness approximately reproduces the entanglement of
formation for large classes of states. Generalization to systems of more than two qubits [14],
or to multiple level systems [15], is straightforward.

2 Coupled Two-qubit System: QNN 

Two interacting qubits, labeled A and B, can be used to build a quantum gate where each 
qubit interacts with a coupling (connectivity) that can be externally adjusted. This is a 
dynamical system that can be prepared in an initial (input) state, which then evolves in time 
to the point where it can be measured at some final time to yield an output. Adjustable 
physical parameters of the qubits allow "programming" to "compute" a specified output in 
response to a given input. 

Consider a two-qubit quantum system that evolves in time according to the Hamiltonian: 
$$H = K_A \sigma_{xA} + K_B \sigma_{xB} + \varepsilon_A \sigma_{zA} + \varepsilon_B \sigma_{zB} + C\,\sigma_{zA}\sigma_{zB} \qquad (1)$$

where $\{\sigma\}$ are the Pauli operators corresponding to each of the two qubits, A and B, $K_A$ and
$K_B$ are the tunneling amplitudes, $\varepsilon_A$ and $\varepsilon_B$ are the biases, and $C$ is the qubit-qubit coupling.
This Hamiltonian can also be written [16] as

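$$H = \sum_{n_1,n_2=0,1} E_{n_1 n_2}\,|n_1,n_2\rangle\langle n_1,n_2|
 - \frac{E_{J1}}{2}\sum_{n_2=0,1}\left(|0\rangle\langle 1| + |1\rangle\langle 0|\right)\otimes|n_2\rangle\langle n_2|
 - \frac{E_{J2}}{2}\sum_{n_1=0,1}|n_1\rangle\langle n_1|\otimes\left(|0\rangle\langle 1| + |1\rangle\langle 0|\right) \qquad (2)$$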

using the two-qubit charge basis |00⟩, |10⟩, |01⟩, |11⟩. This could represent a number of
different possible physical systems, e.g., trapped ions [17] or nuclear magnetic resonance [15].
Here we take as our physical model an electrostatically coupled two-SQUID system [18, 19].
With appropriately timed pulses of the bias amplitudes $\varepsilon_A$ and $\varepsilon_B$, entangled states can be
prepared, and logic gates such as the CNOT or Toffoli reproduced [15, 20]. While we consider
here a SQUID system with the particular physical identifications, above, of the parameters
in the Hamiltonian, the method is by no means limited to that physical implementation of a
two-qubit system, or, indeed, to a two-qubit system. It is easy to see how to generalize to
any Hamiltonian containing appropriately adjustable parameters.

The density matrix of the system as a function of time obeys the Schrödinger equation
$\partial\rho/\partial t = \frac{1}{i\hbar}[H,\rho]$, whose formal solution is $\rho(t) = \exp(iLt)\rho(0)$. This is of similar
mathematical form to the equation for information propagation in a neural network, as follows.
For a traditional artificial neural network, the calculated activation $\phi_i$ of the $i$th neuron is
performed on the signals $\{\phi_j\}$ from the other neurons in the network, and is given by
$\phi_i = \sum_j w_{ij} f_j(\phi_j)$, where $w_{ij}$ is the weight of the connection from the output of neuron $j$
to neuron $i$, and $f_j$ is a bounded differentiable real valued neuron activation function for
neuron $j$. Individual neurons are connected together into a network to process information from
a set of input neurons to a set of output neurons. The network is an operator $F_W$ on the input
vector $\phi_{input}$. $F_W$ depends on the neuron connectivity weight matrix $W$ and propagates the
information forward to calculate an output vector $\phi_{output}$; that is, $\phi_{output} = F_W \phi_{input}$.
The time evolution equation for the density matrix maps the initial state (input) to the final
state (output) in much the same way. The parameters playing the role of the adjustable weights
in the neural network are the set $\{K_A, K_B, \varepsilon_A, \varepsilon_B, C\}$, all of which can be adjusted
experimentally as functions of time for the SQUID system under consideration [16, 19]. By
adjusting the parameters using a neural network type learning algorithm we can train the system
to evolve in time to a final state at the final time $t_f$ for which the desired measure has been
mapped to the function we wish the net to compute: logic gates, or, since the time evolution is
quantum mechanical, a quantum function like the entanglement. Indeed, if we think of the time
evolution operator in terms of the Feynman path integral picture [21], the parallel becomes even
more compelling: instantaneous values taken by the quantum system at intermediate times, which
are integrated over, play the role of "virtual neurons" [3].
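
The input-to-output map described above can be sketched numerically. The following is a minimal illustration, not the simulation used in this work: it builds the Hamiltonian (1) from Kronecker products and propagates a density matrix with constant parameters. The function names, the choice of units with hbar = 1, the sign convention sigma_z|0⟩ = +|0⟩, and the parameter values are all assumptions made for the example.

```python
import numpy as np

# Pauli matrices and the single-qubit identity
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def hamiltonian(KA, KB, epsA, epsB, C):
    """Two-qubit Hamiltonian of Eq. (1); qubit A is the left tensor factor."""
    return (KA * np.kron(SX, I2) + KB * np.kron(I2, SX)
            + epsA * np.kron(SZ, I2) + epsB * np.kron(I2, SZ)
            + C * np.kron(SZ, SZ))

def evolve(rho0, H, t, hbar=1.0):
    """rho(t) = U rho(0) U^dagger with U = exp(-i H t / hbar), H constant."""
    evals, evecs = np.linalg.eigh(H)
    U = evecs @ np.diag(np.exp(-1j * evals * t / hbar)) @ evecs.conj().T
    return U @ rho0 @ U.conj().T

def expectation(rho, op):
    """<O> = tr(rho O), real for Hermitian O."""
    return float(np.real(np.trace(rho @ op)))

# Illustrative (assumed) parameter values in arbitrary units
H = hamiltonian(KA=2.5, KB=2.5, epsA=0.1, epsB=0.1, C=0.1)

# Input: the basis state |00>, ordered as |00>, |01>, |10>, |11>
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0
rho0 = np.outer(psi0, psi0.conj())

rho_t = evolve(rho0, H, t=0.3)
out = expectation(rho_t, np.kron(I2, SZ))   # measure sigma_z on qubit B
print(out)
```

In this picture the five Hamiltonian parameters play the role of the weights, `rho0` is the input, and the measured expectation value is the output.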

In any case the real time evolution of the two-qubit system can be treated as a neural 
network, because its evolution is a nonlinear function of the various adjustable parameters 
(weights) of the Hamiltonian. For as long as coherence can be maintained experimentally, 
it is a quantum neural network (QNN). Thus, if we can find values for the parameters such 
that the set of {inputs, outputs} matches a measure of entanglement, we can use this setup as 
an experimental means of measuring the entanglement of any prepared state of the system, 
whether that state is analytically known or unknown. In this paper we find those parameters 
by training a simulation of the two-qubit system to a set of four input-output pairs. We 
then test the simulated net on a large number of additional states. The net is said to have 
generalized if the results on the testing set are correct and consistent. We show that this is 
so. 

3 Dynamic Learning for the Quantum System 

The goal of learning as applied to this quantum system is to control the system via the external
parameters (tunneling, field and coupling values) to force it to calculate target outputs in
response to given inputs. This is essentially a neural network supervised learning paradigm
extended to the quantum system. The method, derived below, follows the methodology of
Yann LeCun's Lagrangian formulation derivation of backpropagation [22] and Paul Werbos's
description of backpropagation through time [23], and follows some of our earlier work [24, 25,
26] on learning in non-linear optical materials and in training of quantum Hopfield networks.
Our method uses the density matrix representation for generality.

We derive a learning rule for the quantum system based on dynamic backpropagation for
time dependent recurrent neural networks. Given an input (initial density matrix), $\rho(0)$, and
a target output, $d$, a training pair from a training set, we want to develop a weight update rule
based on gradient descent to adjust the system parameters (tunneling, field and coupling),
i.e., train the system "weights", to reduce the squared error between the target, $d$, and the
output, Output. While training the weights, the system's density matrix, $\rho(t)$, is constrained
to satisfy the Schrödinger equation for all time in the interval $(0, t_f)$.

We define a Lagrangian, L, to be minimized as 

$$L = \frac{1}{2}\left[d - \langle O(t_f)\rangle\right]^2 + \int_0^{t_f} \lambda^\dagger(t)\left(\frac{\partial\rho}{\partial t} + \frac{i}{\hbar}[H,\rho]\right)\gamma(t)\,dt \qquad (3)$$

where the Lagrange multiplier vectors are $\lambda^\dagger(t)$ and $\gamma(t)$ (row and column, respectively),
and $O$ is an output measure (or some function of a measure), which can be specified for the
particular problem under consideration and is defined as:

$$\mathrm{Output} = \langle O(t_f)\rangle = \mathrm{tr}[\rho(t_f)\,O] = \mathrm{tr}\Big[\sum_i p_i\,|\psi_i(t_f)\rangle\langle\psi_i(t_f)|\,O\Big] = \sum_i p_i\,\langle\psi_i(t_f)|O|\psi_i(t_f)\rangle \qquad (4)$$

where tr stands for the trace of the matrix. We take the first variation of $L$ with respect to $\rho$,
set it equal to zero, then integrate by parts to give the following equation which can be used
to calculate the vector elements of the Lagrange multipliers that will be used in the learning
rule:

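$$\lambda_i\frac{d\gamma_j}{dt} + \frac{d\lambda_i}{dt}\gamma_j - \frac{i}{\hbar}\sum_k \lambda_k H_{ki}\,\gamma_j + \frac{i}{\hbar}\sum_k \lambda_i H_{jk}\,\gamma_k = 0 \qquad (5)$$

with the boundary conditions at the final time $t_f$ given by

$$-\left[d - \langle O(t_f)\rangle\right]O_{ji} + \lambda_i(t_f)\,\gamma_j(t_f) = 0 \qquad (6)$$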
The gradient descent learning rule is given by 

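$$w_{new} = w_{old} - \eta\,\frac{\partial L}{\partial w} \qquad (7)$$

for each weight parameter $w$, where $\eta$ is the learning rate and

$$\frac{\partial L}{\partial w} = \frac{i}{\hbar}\int_0^{t_f}\lambda^\dagger(t)\left[\frac{\partial H}{\partial w},\rho\right]\gamma(t)\,dt
 = \frac{i}{\hbar}\int_0^{t_f}\sum_{ijk}\lambda_i(t)\left(\frac{\partial H_{ik}}{\partial w}\rho_{kj} - \rho_{ik}\frac{\partial H_{kj}}{\partial w}\right)\gamma_j(t)\,dt \qquad (8)$$

Note that because of the Hermiticity of the Hamiltonian, $H$, and the density matrix, $\rho$,
$\lambda_i\gamma_j = (\lambda_j\gamma_i)^*$, and the derivative of the Lagrangian, $L$, with respect to the weight, $w$, as given
by (8), will be a real number. This simplifies the calculation somewhat.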
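
The update rule (7) can be illustrated with a small sketch. Note the substitution: instead of the adjoint expression (8), the gradient below is estimated by central finite differences, which is slower but short enough to read at a glance. The single training pair, the target value, the learning rate, the evolution time, and the units (hbar = 1) are assumptions chosen for illustration only, and the control qubit is pinned by setting its tunneling and bias to zero.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
SZ_B = np.kron(I2, SZ)                     # output measure O = sigma_zB

def hamiltonian(w):
    """Eq. (1) with K_A = eps_A = 0; trainable weights w = (K_B, eps_B, C)."""
    KB, epsB, C = w
    return KB * np.kron(I2, SX) + epsB * SZ_B + C * np.kron(SZ, SZ)

def output(w, rho0, tf):
    """<O(t_f)> = tr[rho(t_f) O] for constant weights, with hbar = 1."""
    evals, evecs = np.linalg.eigh(hamiltonian(w))
    U = evecs @ np.diag(np.exp(-1j * evals * tf)) @ evecs.conj().T
    return float(np.real(np.trace(U @ rho0 @ U.conj().T @ SZ_B)))

def loss(w, rho0, d, tf):
    """Squared-error part of the Lagrangian (3)."""
    return 0.5 * (d - output(w, rho0, tf)) ** 2

def gradient(w, rho0, d, tf, h=1e-6):
    """Central-difference stand-in for dL/dw (NOT the adjoint formula (8))."""
    g = np.zeros_like(w)
    for n in range(len(w)):
        step = np.zeros_like(w)
        step[n] = h
        g[n] = (loss(w + step, rho0, d, tf) - loss(w - step, rho0, d, tf)) / (2 * h)
    return g

# One assumed training pair: input |00>, target d = -1 for <sigma_zB(t_f)>
psi0 = np.array([1, 0, 0, 0], dtype=complex)
rho0 = np.outer(psi0, psi0.conj())
d, tf, eta = -1.0, 1.0, 0.05

w = np.array([1.0, 0.5, 0.2])              # assumed initial weights
initial_loss = loss(w, rho0, d, tf)
for epoch in range(200):
    w = w - eta * gradient(w, rho0, d, tf)  # Eq. (7)
final_loss = loss(w, rho0, d, tf)
print(initial_loss, final_loss)
```

With several training pairs the per-pair gradients would simply be accumulated before each update; the adjoint method of Eqs. (5), (6) and (8) obtains the same gradient from one backward integration rather than two forward evaluations per weight.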

The time evolution of the quantum system is calculated by integrating the Schrödinger
equation in MATLAB Simulink [27]. The ODE4 fixed step size solver was used with a step
size of 0.05 ns. Discretization error for the numerical integration was checked by redoing the
calculations with a timestep of a tenth the size; results were not affected. Since the error
needs to be back propagated through time, the integration has to be carried out from $t_f$ to 0.
To implement this in MATLAB Simulink, a change of variable is made by letting $t' = t_f - t$.
Instead of using Simulink, the Schrödinger wave equation can also be integrated with any
standard numerical method such as a FORTRAN or C-code program, which we have also
done, to validate the MATLAB simulation.
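
As an illustration of the fixed-step integration, the sketch below applies a classical fourth-order Runge-Kutta step to the density-matrix equation and checks it against exact propagation. It is written in Python rather than Simulink, with assumed parameter values and hbar = 1; a negative total time is used to mimic the backward pass from $t_f$ to 0.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def liouville_rhs(rho, H, hbar=1.0):
    """Right-hand side of d(rho)/dt = [H, rho] / (i hbar)."""
    return (H @ rho - rho @ H) / (1j * hbar)

def rk4_step(rho, H, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = liouville_rhs(rho, H)
    k2 = liouville_rhs(rho + 0.5 * h * k1, H)
    k3 = liouville_rhs(rho + 0.5 * h * k2, H)
    k4 = liouville_rhs(rho + h * k3, H)
    return rho + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(rho, H, t_total, n_steps):
    """Fixed-step integration; a negative t_total runs backward in time."""
    h = t_total / n_steps
    for _ in range(n_steps):
        rho = rk4_step(rho, H, h)
    return rho

# Assumed constant parameters (arbitrary units)
H = (1.0 * np.kron(SX, I2) + 1.0 * np.kron(I2, SX)
     + 0.1 * np.kron(SZ, I2) + 0.1 * np.kron(I2, SZ)
     + 0.1 * np.kron(SZ, SZ))

psi0 = np.array([1, 0, 0, 0], dtype=complex)
rho0 = np.outer(psi0, psi0.conj())
tf, n_steps = 1.0, 200

rho_rk4 = integrate(rho0, H, tf, n_steps)

# Exact propagation for comparison
evals, evecs = np.linalg.eigh(H)
U = evecs @ np.diag(np.exp(-1j * evals * tf)) @ evecs.conj().T
rho_exact = U @ rho0 @ U.conj().T

forward_error = np.max(np.abs(rho_rk4 - rho_exact))
rho_back = integrate(rho_rk4, H, -tf, n_steps)   # integrate from t_f back to 0
roundtrip_error = np.max(np.abs(rho_back - rho0))
print(forward_error, roundtrip_error)
```

Halving the step and confirming that the result does not change is the same discretization check described in the text.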

4 Benchmark Training Results for Two-qubit Gates 

We now train a two-SQUID quantum system in simulation, to produce two-input one-output
classical logic gates, specifically the XOR and XNOR. The system is initialized to (prepared
in) the input states shown in Table 1 and allowed to evolve for 300 ns. We choose as the
output measure $O$ the state of the second qubit, B, at the final time (i.e., $O = \sigma_{zB}(t_f)$).
We call B the "target" qubit (while A is the "control" qubit). The measure is applied to
find the state of the system at this final time $t_f$ and compared to the target output for
that particular input. The parameters $\lambda$ and $\gamma$ are initialized with this error according to
(6), and this is dynamically back propagated through time according to (5) and the weights
updated according to (7). Since the control qubit A does not change its state and we are only
concerned with measuring the output state of the target qubit B at the final time, we do not
need to train the weights corresponding to the parameters for the control SQUID A, i.e., $K_A$
and $\varepsilon_A$ need not be trained. Thus, we only train the weights corresponding to the parameter
values $K_B$, $\varepsilon_B$ and $C_{AB}$. The value for $\varepsilon_A$ can be chosen to be any arbitrarily high value,
say 1.0 GHz; this maintains the control qubit A in its original state. The initial values of the
parameters (weights) before training are shown in Tables 2 and 3 for the XOR and XNOR
gates, respectively, along with the final trained parameters (weights) for each logic gate. The
field $\varepsilon_B$ is allowed to vary with time. The simulation models this by allowing the field to vary
every 100 ns, as a series of step functions: $\varepsilon_B(1)$ and $\varepsilon_B(2)$. The trained responses are shown
in Table 1 along with the RMS error for each logic gate and the number of epochs of training
used for each. The trained parameters agree with our previous analytic results [20]. Note that
the parameter values can be rescaled overall.

Table 1. Training data for two input one output logic gates.

Input      XOR                   XNOR
           Target    Output      Target    Output
|00⟩       -1        -0.9919     +1         0.9902
|01⟩       +1         0.9920     -1        -0.9903
|10⟩       +1         0.9902     -1        -0.9919
|11⟩       -1        -0.9903     +1         0.9920
RMS                   0.00446               0.00447
Epochs                300





5 Quantum Control 

It should be noted that our chosen measurement, $O = \sigma_{zB}(t_f)$, does not check the phase of
the final state. That is, Tables 1-3 show only the acquisition of the classical gates XOR and
XNOR. In order to show that we have a quantum gate we need to train the actual output
state, directly. That is, given an input state, we wish to adjust the parameters of the system
such that it evolves to another given state. This is also known as "quantum control" [28]. Of
course, we cannot use the density matrix approach if we wish to do this. A similar training
procedure using kets is as follows. The output of the quantum system is taken as the overlap
of the state of the system at the final time, $t_f$, with the desired state:

$$\mathrm{Out} = \langle\psi_{desired}|\psi(t_f)\rangle . \qquad (9)$$

We define a Lagrangian, L, to be minimized as 

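$$L = \frac{1}{2}\left|1 - \langle\psi_{des}|\psi(t_f)\rangle\right|^2 + \int_0^{t_f}\left\{\lambda^\dagger(t)\left(\frac{\partial}{\partial t} + \frac{i}{\hbar}H\right)|\psi(t)\rangle + \left[\lambda^\dagger(t)\left(\frac{\partial}{\partial t} + \frac{i}{\hbar}H\right)|\psi(t)\rangle\right]^*\right\}dt , \qquad (10)$$

where * indicates the complex conjugate. To minimize $L$, we set the first variation of $L$ with
respect to $|\psi\rangle$ equal to zero. After integration by parts, this gives a differential equation for
$\lambda$ as

$$\frac{\partial\lambda^\dagger}{\partial t} = \frac{i}{\hbar}\,\lambda^\dagger H \qquad (11)$$

with a condition at the final time as

$$\lambda^\dagger(t_f) = \mathrm{Re}\left[\left(1 - \langle\psi(t_f)|\psi_{des}\rangle\right)\langle\psi_{des}|\right] . \qquad (12)$$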

Again, we can solve this equation backward in time, and take the variation of $L$ with
respect to a weight parameter, $w$, where $w$ represents any one of the parameters from the
Hamiltonian, $H$, such as $K_A$, $K_B$, etc., to give the parameter (weight) update learning rule
(7), where now

$$\frac{\partial L}{\partial w} = \frac{i}{\hbar}\int_0^{t_f}\lambda^\dagger(t)\,\frac{\partial H}{\partial w}\,|\psi(t)\rangle\,dt + \mathrm{c.c.} \qquad (13)$$

and $\eta$ is again the learning rate. Training using this method is less stable than with the
density matrix; in particular, if we try to pin the control qubit A as before, with a large value
of $\varepsilon_A$, there are rapid oscillations in the phase which make computation difficult. Fortunately
one can get the same pinning result by setting $K_A = 0$. (Use of the density matrix method
gets rid of this phase oscillation problem but of course does not allow the user to determine
the phase of the output.)

Results are shown in Table 4. This is not the CNOT, since one of the qubits has an 
extra phasepj] of it; however, this gate, followed by the "controlled Z" gives the CNOT. (The 



Table 2. Initial and trained parameters (weights) for XOR gate, in MHz.

Parameter (MHz)      Initial     XOR-trained
K_A                  2.1333       2.1333
K_B                  2.1333       1.2684
C_AB                 0.1         -0.97981
ε_A(1) = ε_A(2)      1,000        1,000
ε_B(2)               0.1          1.0518
ε_B(3)               0.1          1.0534



18 Quantum algorithm design using dynamic learning 



solution for single-qubit phase shift gates is obvious: set $C_{AB} = 0$, and let $\varepsilon_B$ be nonzero for
the amount of time necessary.)

It should be noted that this, like the application in the previous section, is a "blind"
application of our method. Recent work [12], especially by Khaneja et al., using a geometric
approach, shows that careful analysis can produce a much more efficient realization
of a quantum operator. Still, our method allows relatively quick learning, and can be readily
applied even when the problem, or operator, is not well understood, as with the entanglement
witness in the next section.

6 Entanglement Witness 

Of course it is known how to produce simple universal gates with any of a number of 
physical implementations, and as long as we know how to decompose our desired computation 
into those simpler blocks we do not need to do it "all at once." Showing that our quantum 
neural network can be trained to do progressively more complicated gates does have some 
advantages: Online training automatically adjusts for small effects not taken into account in 
whatever model was used to design the algorithm, for example. 

However, of much more interest is the possibility that a neural or AI approach can help
us calculate things we do not have an algorithm for, and/or which we do not know how to
decompose into simple gates. Entanglement is a good example. It is widely thought that the
power of quantum computing and communications relies heavily on the use and manipulation
of entanglement [6]. But the quantification of entanglement is still not fully understood even
for pure states, and for mixed states the situation is cloudier. For many practical applications
we will also need an extension to systems of more than two qubits [14] or to entangled pairs
of N-state systems, "quNits" [15].

Two prominent universal measures do exist: those of Bennett et al. [29] and Wootters [30],
and of Vedral et al. [31]. The first is the entanglement of formation, which for pure states is
equal to the von Neumann entropy of the reduced single-qubit states of the two-qubit system.
The second constructs a space of density matrices, containing a subspace of unentangled
states, and defines as the entanglement of a state the minimum distance of its density matrix
to that subspace. The two measures do not give the same answer for a number of states,
but each is internally consistent. We now use our training method to map an entanglement
witness to a single (local) measure at the final time [32]. Unlike the method of Vedral, it
does not require a minimization procedure, which can become cumbersome especially if the
number of qubits is large; unlike both, it can in principle easily be used experimentally to

Table 3. Initial and trained parameters (weights) for XNOR gate, in MHz.

Parameter (MHz)      Initial     XNOR-trained
K_A                  2.1333       2.1333
K_B                  2.1333       1.2682
C_AB                 0.1          0.97973
ε_A(1) = ε_A(2)      1,000        1,000
ε_B(2)               0.1          1.0508
ε_B(3)               0.1          1.0524

measure the entanglement of an unknown state. We show that our method approximately
reproduces the entanglement of formation for large classes of states.
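
For pure states the first of these measures is easy to evaluate directly. The sketch below computes the von Neumann entropy (base 2) of the reduced single-qubit state. The helper name is hypothetical, and the amplitudes used for the partially entangled state are an assumption: a state with three equal amplitudes, (|01⟩ + |10⟩ + |11⟩)/√3, chosen to be consistent with a classical correlation of -1/3.

```python
import numpy as np

def entanglement_of_formation_pure(psi):
    """Von Neumann entropy (base 2) of qubit A's reduced state for a two-qubit ket.

    psi lists the amplitudes of |00>, |01>, |10>, |11>; normalization is applied here.
    """
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    M = psi.reshape(2, 2)              # M[a, b] = amplitude of |a b>
    rho_A = M @ M.conj().T             # partial trace over qubit B
    p = np.linalg.eigvalsh(rho_A)
    p = p[p > 1e-12]                   # use the convention 0 log 0 = 0
    return float(-np.sum(p * np.log2(p)))

E_bell = entanglement_of_formation_pure([1, 0, 0, 1])   # maximally entangled
E_flat = entanglement_of_formation_pure([1, 1, 1, 1])   # product state
E_P = entanglement_of_formation_pure([0, 1, 1, 1])      # assumed partially entangled state
print(E_bell, E_flat, E_P)
```

For the assumed three-amplitude state this evaluates to about 0.55, while the maximally entangled and product states give 1 and 0 respectively.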

We return to a two-qubit system, for which the Hamiltonian is given by (1). We take
as our output the square of the two-qubit correlation function at the final time, that is,
$\langle O(t_f)\rangle = \langle\sigma_{zA}(t_f)\sigma_{zB}(t_f)\rangle^2 = [\mathrm{tr}(\rho(t_f)\sigma_{zA}\sigma_{zB})]^2$. (We use here the square of the correlation
function so that the range of the output will be [0,1] for convenience; the only modification
necessary in Eq. (8) is the multiplication by $2\langle\sigma_{zA}(t_f)\sigma_{zB}(t_f)\rangle$.) In Table 5 we present a
representative sample of the kinds of states whose entanglement we wish to calculate, along
with the entanglement of each as calculated by the methods of references [29, 30, 31]. The
Bell and EPR states are maximally entangled, and product and (completely) mixed states
are minimally entangled. We chose a training set of four: one completely entangled state,
one unentangled state, one classically correlated but unentangled state, and one partially
entangled state. We used a time of evolution of 1000 ns, enough for the system to go through
one complete oscillation. Results are presented in Table 6. The RMS error for the set, after
training, is essentially zero. Table 7 lists the parameter values the system trained to in order
to achieve the entanglement witness. We then tested the witness on a testing set which
included a maximally entangled state of a type not seen before by the net, the EPR state; a
pure unentangled state; a correlated unentangled state; a different partially entangled state;
and a mixed state. Results for the testing set are shown in Table 8. The error for the testing
set is also essentially zero. A large number of permutations of possible states have been tried
with very similar results. The quantum neural net has learned to compute an approximate
general measure of entanglement.
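
The output measure itself is simple to evaluate for a given density matrix. As a hedged illustration, the sketch below computes the squared correlation at the initial time, before any evolution, for the kinds of states in the training set. The helper names are hypothetical, and the three-amplitude form used for P is an assumption consistent with its classical correlation of -1/3.

```python
import numpy as np

SZ = np.diag([1.0, -1.0])
ZZ = np.kron(SZ, SZ)                       # sigma_zA sigma_zB

def ket_to_rho(amps):
    """Density matrix of a (normalized) ket with amplitudes on |00>,|01>,|10>,|11>."""
    psi = np.asarray(amps, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def squared_correlation(rho):
    """[tr(rho sigma_zA sigma_zB)]^2, the output measure evaluated on rho."""
    return float(np.real(np.trace(rho @ ZZ))) ** 2

bell = ket_to_rho([1, 0, 0, 1])
flat = ket_to_rho([1, 1, 1, 1])
c_state = ket_to_rho([1, 0.5, 0, 0])       # C with gamma = 0.5
p_state = ket_to_rho([0, 1, 1, 1])         # assumed form of P
m_state = np.diag([0.5, 0.0, 0.0, 0.5]).astype(complex)   # mixed state M

initial = {name: squared_correlation(rho) for name, rho in
           [("Bell", bell), ("Flat", flat), ("C", c_state), ("P", p_state), ("M", m_state)]}
print(initial)
```

At the initial time the classically correlated states C and M give nonzero values (0.36 and 1) although they are unentangled, which is why the parameters must be trained so that the measure at the final time separates correlation from entanglement.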

The training for the partially entangled state P deserves some further comment. We
noted, above, that there is no general agreement on what the entanglement of P is, though it
ought to lie somewhere between 0 and 1 on the scale we are using. Therefore we trained the
network for various different target values for the entanglement of P, including the numbers
(0.32, 0.46, 0.55) calculated by the three comparison methods shown in Table 5. In Figure
1 we show the total error of the QNN for both the training and testing sets, as a function
of the desired value. There is a minimum at approximately 0.44317 (though that degree of
precision is probably imaginary). What this means is that using the value of 0.44317 for
the fourth training pair significantly increased the compatibility of the set of training and
testing pairs, taken together; while we could train the state P to any value we desire, only
if we set that value to 0.44317 can we get at the same time the values we wish for the other
training and testing pairs. We can say that in some sense the value of about 0.44 is the



Table 4. Training of the CNOT.

Input    Output amplitudes
         |00⟩                |01⟩                |10⟩                |11⟩
|00⟩     0.9998 + 0.009i     0.0006 + 0.0188i    0.0000              0.0000
|01⟩     0.0006 + 0.0188i    0.9977 - 0.0653i    0.0000              0.0000
|10⟩     0.0000              0.0000              0.0323 + 0.0i      -0.9993 + 0.0202i
|11⟩     0.0000              0.0000              0.9993 + 0.0203i    0.0292 - 0.0085i

K_A = 0.0 = ε_A(1) = ε_A(2); C_AB = 0.0096
K_B = 0.0054; ε_B = (-0.0606, -0.0133, -0.0055)

Table 5. Some possible states of the two-qubit system. The relative amplitudes (for the ket
states) are given without normalization for clarity. The first two are maximally entangled. The
second two are product states (flat = (|0⟩ + |1⟩)_A (|0⟩ + |1⟩)_B and C = |0⟩_A (|0⟩ + γ|1⟩)_B) and
thus have zero entanglement; and P is partially entangled. M is a mixed state and cannot be
expressed as a ket; its density matrix is given instead. The classical correlation is computed as
⟨σ_zA(0) σ_zB(0)⟩. C and M are classically correlated but not entangled. The last three columns
show the entanglement as calculated by the methods of Bennett [29] and Vedral [31], using the von
Neumann metric and the Bures metric. Both distance measures have been normalized to unity.

State   Relative amplitudes             Classical                Entanglement
                                        correlation              Bennett   von Neumann   Bures
Bell    |00⟩ + e^{iθ}|11⟩               1                        1         1             1
EPR     |01⟩ + e^{iθ}|10⟩               -1                       1         1             1
flat    |00⟩ + |01⟩ + |10⟩ + |11⟩       0                        0         0             0
C       |00⟩ + γ|01⟩                    (1 - |γ|²)/(1 + |γ|²)    0         0             0
P                                       -1/3                     0.55      0.32          0.46
M       (|00⟩⟨00| + |11⟩⟨11|)/2         1                        0         0             0






Table 6. Training data for QNN entanglement witness.

Input state    Initial    Desired    Trained
Bell, θ = 0    1.0        1.0        0.99997
Flat           0.0        0.0        2.01 x 10^-6
C, γ = 0.5     0.36       0.0        2.61 x 10^-5
P              0.11       0.44317    0.44317
RMS                                  1.08 x 10^-5
Epochs                               2000





Table 7. Initial and trained parameters for entanglement, in MHz.

Parameter (MHz)    Initial    Trained
K_A(1)             2.5        2.3576
K_A(2)             2.5        2.3576
K_A(3)             2.5        2.3577
K_A(4)             2.5        2.3461
K_B(1)             2.5        2.3576
K_B(2)             2.5        2.3576
K_B(3)             2.5        2.3576
K_B(4)             2.5        2.3546
C(1)               0.1        0.045026
C(2)               0.1        0.10117
C(3)               0.1        0.10771
C(4)               0.1        0.044221
ε_A(1)             0.1        0.10913
ε_A(2)             0.1        0.03768
ε_A(3)             0.1        0.08671
ε_A(4)             0.1        0.071464
ε_B(1)             0.1        0.10913
ε_B(2)             0.1        0.063774
ε_B(3)             0.1        0.038802
ε_B(4)             0.1        0.072387

Table 8. Representative testing results for the quantum neural network. P2 is the state
(|00⟩ + |10⟩ + |11⟩)/√3. Parameters used are listed in Table 7.

State              Desired    QNN Output
EPR, θ = 0         1.0        1.0
EPR, θ = π         1.0        1.0
Bell, θ = π        1.0        0.99997
|00⟩               0.0        3.24 x 10^-5
|10⟩ + 0.9|11⟩     0.0        3.39 x 10^-6
P2                 0.44317    0.44317
M                  0.0        2.59 x 10^-13
RMS                           7.68 x 10^-6







"natural" value for the entanglement of this state, at least as computed by this net; that 
is, using this value as the target value for P leads to the greatest possible self-consistency 
for the method. Interestingly 4/9 = 0.44444 is the value for the entanglement of state P as
calculated by the formula $\mathrm{tr}(\rho\tilde\rho)$, where $\tilde\rho = (\sigma_y \otimes \sigma_y)\rho^*(\sigma_y \otimes \sigma_y)$, which for pure states is a
monotonically increasing function (like the concurrence) and thus could be used as a possible
measure of entanglement. This measure, however, fails for mixed states: in particular, it gives
an entanglement of 1 for the mixed state M (Table 5). (Coincidentally the number 0.44229
for the entanglement of formation also has some special significance [33] for partially mixed
states: this is the point at which the Werner states [34] begin to violate the Bell inequality.)
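
The pure-state value quoted above can be checked in a few lines. This is an illustrative sketch only: the helper names are hypothetical, and the amplitudes taken for P, (|01⟩ + |10⟩ + |11⟩)/√3, are an assumption consistent with its classical correlation of -1/3.

```python
import numpy as np

SY = np.array([[0, -1j], [1j, 0]])
YY = np.kron(SY, SY)                       # sigma_y (x) sigma_y

def ket_to_rho(amps):
    """Density matrix of a normalized ket with amplitudes on |00>,|01>,|10>,|11>."""
    psi = np.asarray(amps, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def tr_rho_rho_tilde(rho):
    """tr(rho rho~) with rho~ = (sy x sy) rho* (sy x sy)."""
    rho_tilde = YY @ rho.conj() @ YY
    return float(np.real(np.trace(rho @ rho_tilde)))

value_P = tr_rho_rho_tilde(ket_to_rho([0, 1, 1, 1]))      # assumed form of P
value_bell = tr_rho_rho_tilde(ket_to_rho([1, 0, 0, 1]))
value_flat = tr_rho_rho_tilde(ket_to_rho([1, 1, 1, 1]))
print(value_P, value_bell, value_flat)
```

For a pure state with real amplitudes (a, b, c, d) this quantity reduces to 4(ad - bc)², the square of the concurrence, which gives 4/9 for the assumed P, 1 for the Bell state and 0 for the flat product state.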




Fig. 1. Total error, including training and testing, for different target values for the partially
entangled state P. In each run, represented by a data point on the graph, all four of the training
set pairs were trained, and only the desired value for the state P was changed; all were tested on
the same testing set as shown in Table 8. The minimum error found, 6.27 x 10^-6, occurs at a
target value of 0.44317.

It may also be of interest to note that almost all the contribution to the error comes from
the partially entangled states in the training and testing sets. So, for example, the total error
(from training and testing sets) for the target value of 0.32 for P is about 0.2; most of this
comes from the testing of state P2, for which the net calculates 0.41942 instead of 0.32. That
is, except at the target value of about 0.44, the net does not recognize P2 as being "the same"
as P. Results for the mixed states are almost exactly the same irrespective of the target value
for P: if they were plotted in Figures 3 and 4, they would lie almost on top of the QNN
values shown. The results for completely mixed states are always calculated to be zero (10^-6
or smaller).

Once the net is trained it is then a simple matter to calculate the output value (square
of the correlation function) which would be measured at the given final time for any state
at all, pure or mixed. In addition to the testing set, we set up a grid to check that the
calculated entanglement of all pure product states is zero, that is, all states of the form
$(\alpha|1\rangle + \beta|0\rangle)_A(\gamma|1\rangle + \delta|0\rangle)_B/\sqrt{\alpha^2\gamma^2 + \alpha^2\delta^2 + \beta^2\gamma^2 + \beta^2\delta^2}$. The RMS error for the set of 10,000 was
1.1 x 10^-7. Similarly we tested a grid for mixed states; the RMS error for the set of 10,000
was 2.6 x 10^-7. Thus for both pure and mixed separable states the QNN gives reasonably
good results.

We compare the QNN method with both a (widely-accepted) measure (Bennett and
Wootters's entanglement of formation) and a witness (Toth and Guhne's local entanglement witness
[35], $W_{GHZ_N}$, which for the two-qubit system considered here is equal to $1 - \sigma_{xA}\sigma_{xB} - \sigma_{zA}\sigma_{zB}$).
This is because, while we claim to have designed only a witness, not a measure, our method
turns out to be somewhat better than expected: to some extent, it gives information, as well,
on the amount of entanglement present. When we compare the QNN to the entanglement
of formation, we are looking for numerical agreement and get it to some extent; when we
compare both Toth and Guhne's W and the QNN, as witnesses, to the entanglement of
formation, we look, as is appropriate, only for agreement as to whether or not entanglement is
present. Note that this last measures proximity to the Bell triplet state and that a negative
number indicates entanglement.

Figure 2 shows the QNN entanglement for the pure state
$P3(\gamma) = (|00\rangle + |11\rangle + \gamma|01\rangle)/\sqrt{2 + \gamma^2}$, as a function of $\gamma$. Values for the entanglement
of formation are also shown for comparison; the
results are very similar and constitute good agreement. The W witness correctly indicates the
presence of entanglement until the contamination gets too large: for $\gamma = 1$ W is zero, which
indicates no entanglement and is incorrect. Thus here the QNN does better as a witness than
W. The disagreement is probably because the W witness is only good for states close to the
Bell triplet state $|\Phi^+\rangle$; for $\gamma = 1$ we are probably sufficiently far from the Bell triplet state
that the witness no longer applies well.
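
The behaviour of the W witness on this family of states can be reproduced directly from its two-qubit form. The sketch below evaluates ⟨W⟩ = ⟨1 - σ_xA σ_xB - σ_zA σ_zB⟩ and, for comparison, the pure-state concurrence 2|ad - bc|. The function names are hypothetical and the amplitudes are taken to be real, as for all states considered so far.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.diag([1.0, -1.0])
W = np.eye(4) - np.kron(SX, SX) - np.kron(SZ, SZ)

def p3(gamma):
    """Normalized ket (|00> + |11> + gamma|01>), ordered |00>,|01>,|10>,|11>."""
    psi = np.array([1.0, gamma, 0.0, 1.0])
    return psi / np.linalg.norm(psi)

def witness(psi):
    """<psi|W|psi> for a real ket; a negative value signals entanglement."""
    return float(psi @ W @ psi)

def concurrence_pure(psi):
    """Concurrence 2|ad - bc| of a two-qubit pure state."""
    a, b, c, d = psi
    return float(2 * abs(a * d - b * c))

for gamma in (0.0, 0.5, 1.0):
    print(gamma, witness(p3(gamma)), concurrence_pure(p3(gamma)))
```

Analytically ⟨W⟩ = (2γ² - 2)/(2 + γ²) for this family, so the witness reaches zero at γ = 1 even though the concurrence there is still 2/3.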

Figure 3 shows the calculated entanglement for the Bell triplet Werner [34] mixed states 
as a function of fidelity, again, with Bennett and Wootters's entanglement of formation and 
Toth and Guhne's witness W for comparison. Again, agreement of the QNN with the former 
is quite good though not exact. For fidelity between approximately 0.28 and 0.5 our method 
gives a small but nonzero entanglement, which is incorrect, since it has been shown[29] that 
for 0.25 < F < 0.5 the Werner state can be written as a mixture of product states. Here 
W is a better witness than the QNN. In Figure 4 we show the QNN entanglement for the 
states M'( 7 ) = ^ n " n ' + + f and compare those results to those for the entanglement 
of formation. Again the agreement is good though not exact. The W witness here fails for
γ > 0.5.
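
For mixed states the entanglement of formation used in these comparisons can be computed from Wootters's concurrence. The sketch below does this for the Bell triplet Werner states and also evaluates the two-qubit W witness. It is an illustration with hypothetical function names, taking |Φ+⟩ = (|00⟩ + |11⟩)/√2 and mixing it with equal weights of the other three Bell states.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.diag([1.0, -1.0]).astype(complex)
YY = np.kron(SY, SY)
W = np.eye(4) - np.kron(SX, SX) - np.kron(SZ, SZ)

def werner(F):
    """F |Phi+><Phi+| + (1 - F)/3 times the sum of the other three Bell projectors."""
    phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    proj = np.outer(phi_plus, phi_plus.conj())
    return F * proj + (1 - F) / 3 * (np.eye(4) - proj)

def concurrence(rho):
    """Wootters concurrence from the eigenvalues of rho * rho~."""
    R = rho @ YY @ rho.conj() @ YY
    lam = np.sqrt(np.clip(np.real(np.linalg.eigvals(R)), 0, None))
    lam = np.sort(lam)[::-1]
    return float(max(0.0, lam[0] - lam[1] - lam[2] - lam[3]))

def entanglement_of_formation(rho):
    """Binary entropy of (1 + sqrt(1 - C^2)) / 2."""
    c = concurrence(rho)
    x = (1 + np.sqrt(max(0.0, 1 - c ** 2))) / 2
    if x >= 1.0:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

def witness(rho):
    """tr(rho W); a negative value signals entanglement."""
    return float(np.real(np.trace(rho @ W)))

for F in (0.3, 0.5, 0.75, 1.0):
    rho = werner(F)
    print(F, concurrence(rho), entanglement_of_formation(rho), witness(rho))
```

For these states the concurrence is max(0, 2F - 1), so it vanishes for F ≤ 0.5, and the witness evaluates to (5 - 8F)/3.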

Using a larger training set, one that includes mixed states, does reduce the error shown in
Figure 3, though with this model (as Figure 1 demonstrates) we have already minimized the
error shown in Figure 2 by optimizing the target value for P. Changing the target value for
P does not alter the results for Figure 3 significantly. We thought it more interesting to show
results for what seems to be the minimum training set for an (approximate) entanglement
witness.



[Figure 2: plot. Title: ket (|11⟩ + |00⟩ + γ|01⟩)/sqrt(2 + γ^2); horizontal axis: γ; curves: QNN, W, Bennett-Wootters.]

Fig. 2. Entanglement of the pure state $P3(\gamma) = (|00\rangle + |11\rangle + \gamma|01\rangle)/\sqrt{2 + \gamma^2}$, as a function of $\gamma$, as calculated by
the QNN, using the trained parameters listed in Table 7; by Bennett's entanglement of formation
[29, 30]; and by Toth and Guhne's local entanglement witness W [35]. Note that W < 0 indicates
entanglement.



7 Discussion 

It will have been noticed that all the coefficients on states so far trained or tested were
real. It is a natural question at this point to ask about a phase difference, e.g., between the
two parts of a Bell state. In Figure 5 we show the calculated correlation function for the
Bell state $|00\rangle + e^{i\theta}|11\rangle$ as a function of initial phase difference $\theta$. (The EPR state shows
exactly the same dependence.) Of course the actual entanglement is not a function of the
phase difference, so here the QNN measure is wrong; or rather, this is the major reason our
method produces only a "witness," not a "measure." Interestingly, Tóth and Gühne's W witness
shows a similar oscillation, though at half the frequency.

Product states show a similar if smaller amplitude oscillation. Figure 6 shows that the
oscillation in the product state C is so small as to be negligible; W correctly predicts no
entanglement, though we are surely too far from the Bell state for the method to be applicable.
Figure 7 shows the oscillation for $P_2$ when the phase difference is between the entangled pair
and $|01\rangle$. The "correct" target value for the P states depends on the phase difference, and is
equal to 0.44317 only for $\theta = 0$; the mean value is very close to 4/9 (0.44444 to five significant
figures, for twenty evaluated points between 0 and $2\pi$). Figure 8 shows the oscillation for
$P_1$ when the phase difference is within the entangled pair. The oscillation from Figure 5 is
reproduced, as expected, but on a smaller amplitude scale (0 to 0.44 rather than 0 to 1)
because of the presence of the amplitude in the $|01\rangle$ state. In this case, as in Figure 5, we can
see the oscillation in the W witness as well. Note that for this state, for any value of $\theta$, the
W witness predicts no entanglement; this is doubtless due to its being insufficiently "close" to
the Bell state. The best point is at $\theta = 0$ (or $2\pi$); this is the same as the point at $\gamma = 1$ in
Figure 2.

Fig. 3. Entanglement for the Bell triplet Werner mixed states
$F|\Phi^+\rangle\langle\Phi^+| + \frac{1-F}{3}\left(|\Psi^+\rangle\langle\Psi^+| + |\Psi^-\rangle\langle\Psi^-| + |\Phi^-\rangle\langle\Phi^-|\right)$
as a function of fidelity F, where $|\Psi^\pm\rangle = (|\uparrow\downarrow\rangle \pm |\downarrow\uparrow\rangle)/\sqrt{2}$ and
$|\Phi^\pm\rangle = (|\uparrow\uparrow\rangle \pm |\downarrow\downarrow\rangle)/\sqrt{2}$, as calculated by Bennett/Wootters [30], by QNN, and by Tóth/Gühne's
W witness [35]. Again the QNN parameters were as listed in Table 7. The Werner states are
x = (4F - 1)/3 parts pure triplet (fully entangled), and (1 - x) parts identity operator (completely
mixed) [29]. Note that W < 0 indicates entanglement.

Fig. 4. Entanglement for the states $M'(\gamma) = (\gamma|01\rangle\langle 01| + |\Phi^+\rangle\langle\Phi^+|)/(1+\gamma)$, calculated by
entanglement of formation, by QNN, and by Tóth/Gühne's W. For $\gamma = 1$ this is Bennett's M
state; the QNN calculates its entanglement as 0.24716, while Wootters's method [30] gives 0.35458.
QNN parameters were as listed in Table 7. Note that W < 0 indicates entanglement.



[Figure 5 plot. Title: "Ket: (exp(i*theta)|11>+|00>)/sqrt(2)".]

Fig. 5. Calculated entanglement for the state $|00\rangle + e^{i\theta}|11\rangle$, as a function of $\theta$, by Bennett-Wootters,
QNN, and Tóth-Gühne's W. QNN parameters were as listed in Table 7. Note that
W < 0 indicates entanglement.

Fig. 6. Calculated entanglement for the C state, by Bennett-Wootters, QNN, and Tóth-Gühne's
W. QNN parameters were as listed in Table 7. Note that W < 0 indicates entanglement.



We can see why both witnesses behave this way by considering that any single measurement
must be of the form of the trace of the initial density matrix times some Hermitian operator.
Let that matrix be given by

$$
\begin{pmatrix}
e_1 & a_1 + i b_1 & a_2 + i b_2 & a_3 + i b_3 \\
a_1 - i b_1 & e_2 & c_1 + i d_1 & c_2 + i d_2 \\
a_2 - i b_2 & c_1 - i d_1 & e_3 & c_3 + i d_3 \\
a_3 - i b_3 & c_2 - i d_2 & c_3 - i d_3 & e_4
\end{pmatrix}
$$

Now, for the case of the initial state's being $|00\rangle + e^{i\theta}|11\rangle$, this means that the expectation value is given by
$e_1 + e_4 + 2\,\mathrm{Re}[e^{i\theta}(a_3 - i b_3)]$. Unless both $a_3$ and $b_3$ are zero this necessarily depends on
$\theta$; even if we set them both to zero we still have to deal with, e.g., the case of the state
$|01\rangle + e^{i\theta}|10\rangle$. Thus it is not possible to design a single measurement as a completely general
entanglement measure. The oscillation that is seen in both our witness and that of Tóth
and Gühne is inescapable. Thus in order to use our method to measure independently the
entanglement of an unknown state, it is necessary to do at least one other measurement and
perhaps two. Recently Yang and Han [36] have devised a means of extracting an arbitrary
relative phase from a multiqubit entangled state by local Hadamard transformations and
measurements along a single basis; this method together with our own, then, can be used as
an unambiguous entanglement witness.
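
The argument above can be checked numerically. The sketch below is illustrative only: the helper names, the random Hermitian operator, and the basis ordering |00>, |01>, |10>, |11> are assumptions, and the state is normalized, which introduces a factor of 1/2 relative to the unnormalized expression in the text. It shows that the expectation value of an arbitrary Hermitian operator in the state $(|00\rangle + e^{i\theta}|11\rangle)/\sqrt{2}$ varies with $\theta$ through the matrix element coupling $|00\rangle$ and $|11\rangle$, while the concurrence of the state stays fixed at 1.

```python
import numpy as np

def bell_with_phase(theta):
    """(|00> + e^{i theta}|11>)/sqrt(2); assumed ordering |00>,|01>,|10>,|11>."""
    return np.array([1.0, 0.0, 0.0, np.exp(1j * theta)]) / np.sqrt(2)

def pure_state_concurrence(psi):
    """Concurrence 2|ad - bc| of a pure state a|00> + b|01> + c|10> + d|11>."""
    a, b, c, d = psi
    return 2.0 * abs(a * d - b * c)

def expectation(psi, H):
    """<psi|H|psi>, real for Hermitian H."""
    return float(np.real(np.vdot(psi, H @ psi)))

def random_hermitian(rng, n=4):
    """A generic Hermitian operator standing in for 'some measurement'."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

rng = np.random.default_rng(0)
H = random_hermitian(rng)
thetas = np.linspace(0.0, 2.0 * np.pi, 9)

# Direct evaluation of the single-measurement expectation value.
measured = np.array([expectation(bell_with_phase(t), H) for t in thetas])

# Closed form: diagonal terms plus a theta-dependent cross term from H[0, 3].
predicted = np.array([
    0.5 * (H[0, 0].real + H[3, 3].real) + np.real(np.exp(1j * t) * H[0, 3])
    for t in thetas
])

# The entanglement itself does not depend on theta.
concurrences = np.array([pure_state_concurrence(bell_with_phase(t)) for t in thetas])
```

Here `measured` oscillates with the phase unless the corner element `H[0, 3]` vanishes, whereas `concurrences` is constant, which is the mismatch described in the text.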
Fig. 7. Entanglement of the state $|00\rangle + |11\rangle + e^{i\theta}|01\rangle$, as a function of $\theta$, as calculated by QNN.
QNN parameters were as listed in Table 7. Tóth-Gühne's W is zero for all values of $\theta$ (too far
from the Bell state; see Figure 2).



[Figure 8 plot. Title: "Ket: (exp(i*theta)|11>+|01>+|00>)/sqrt(3)".]

Fig. 8. Entanglement of the state $|00\rangle + e^{i\theta}|11\rangle + |01\rangle$, as a function of $\theta$, as calculated by Bennett-Wootters,
QNN, and Tóth-Gühne's W. QNN parameters were as listed in Table 7. Note that
W < 0 indicates entanglement.

One possible problem with the proposed method is that the output function chosen is
always greater than or equal to zero; thus, in actual experimental measurements, it will
systematically overestimate the entanglement for states close to separability. It should also
be noted that the correlation function cannot be determined in a single measurement, since it
is an average quantity. To measure the correlation function experimentally, even if the average
should be very close to zero or to one, it would be necessary to produce the desired state many
times; since in a given situation it may not be possible to do so, or to do so without varying the
phase difference, this may defeat the purpose. Nevertheless we believe this approach may be a
fruitful one: it is certainly easier to measure the correlation function than it is to determine
the density matrix in full, as standard calculational approaches require, or even the four
parameters necessary to find the concurrence [37]. It is possible that a different, more clever
choice of measurement operator(s) could reduce the number of measurements necessary still
further. In addition, generalization to multiple qubits or to quNits is straightforward: as
long as a training set of sufficient size is used there is no reason to think that the net would
be unable to learn the generalized measure. Our experience here seems to show that the
necessary size is not large. We are currently pursuing these lines of research.
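
The two experimental caveats above (a non-negative output overestimates near separability, and an average needs many repetitions) can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of the actual QNN output: it assumes each shot yields an outcome of +1 or -1, and it uses the square of the sample mean as a hypothetical stand-in for a non-negative output function. For a true correlation of zero, the expected squared estimate from N shots is 1/N rather than zero.

```python
import numpy as np

def sample_correlation_estimates(true_corr, shots, repetitions, rng):
    """Simulate `repetitions` experiments, each averaging `shots` outcomes of +/-1.

    Assumption: every shot is an independent +/-1 outcome whose mean is `true_corr`.
    """
    p_plus = (1.0 + true_corr) / 2.0
    n_plus = rng.binomial(shots, p_plus, size=repetitions)
    return (2.0 * n_plus - shots) / shots

def expected_squared_estimate(true_corr, shots):
    """E[(sample mean)^2] = mu^2 + (1 - mu^2)/N for independent +/-1 outcomes."""
    return true_corr**2 + (1.0 - true_corr**2) / shots

rng = np.random.default_rng(1)
shots = 100

# A state with zero true correlation: the squared estimate is still positive on average.
estimates = sample_correlation_estimates(0.0, shots, 20000, rng)
mean_squared = float(np.mean(estimates**2))
bias_prediction = expected_squared_estimate(0.0, shots)
```

In this toy model the upward bias shrinks only as 1/N, so resolving a small entanglement value near separability requires preparing the state many times.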

8 Conclusions 

In this paper we have developed a general dynamic learning algorithm for training a 
quantum computer, for either pure or mixed states. We have demonstrated successful learning 
of some simple benchmark applications. We have also shown that this method can be used 
for the learning of an entanglement witness for an input state. We have shown that our
witness approximately reproduces the entanglement of formation for large classes of states.
While it suffers from a systematic oscillation problem, so must every single-measurement
entanglement witness, and a method like Yang and Han's can be used in conjunction to take
care of the problem. It is superior to other witnesses in that, first, the state need not be
"close" to a particular kind of entangled state or, indeed, to any particular state; and, second, 
that the state may be completely unknown. Generalization to systems of more than two 
qubits and to multiple level systems is in progress. 